Maternal pre-pregnancy body mass index and risk of preterm birth: a collaboration using large routine health datasets

Background Preterm birth (PTB) is a leading cause of child morbidity and mortality. Evidence suggests an increased risk with both maternal underweight and obesity, with some studies suggesting underweight might be a greater factor in spontaneous PTB (SPTB) and that the relationship might vary by parity. Previous studies have largely explored established body mass index (BMI) categories. Our aim was to compare associations of maternal pre-pregnancy BMI with any PTB, SPTB and medically indicated PTB (MPTB) among nulliparous and parous women across populations with differing characteristics, and to identify the optimal BMI with lowest risk for these outcomes.
Methods We used three UK datasets, two USA datasets and one each from South Australia, Norway and Denmark, together including just under 29 million pregnancies resulting in a live birth or stillbirth after 24 completed weeks' gestation. Fractional polynomial multivariable logistic regression was used to examine the relationship of maternal BMI with any PTB, SPTB and MPTB, among nulliparous and parous women separately. The results were combined using a random effects meta-analysis. The estimated BMI at which risk was lowest was calculated via differentiation and a 95% confidence interval (CI) obtained using bootstrapping.
Results We found non-linear associations between BMI and all three outcomes, across all datasets. The adjusted risk of any PTB and MPTB was elevated at both low and high BMIs, whereas the risk of SPTB was increased at lower levels of BMI but remained low or increased only slightly with higher BMI. In the meta-analysed data, the lowest risk of any PTB was at a BMI of 22.5 kg/m2 (95% CI 21.5, 23.5) among nulliparous women and 25.9 kg/m2 (95% CI 24.1, 31.7) among multiparous women, with values of 20.4 kg/m2 (20.0, 21.1) and 22.2 kg/m2 (21.1, 24.3), respectively, for MPTB; for SPTB, the risk remained largely constant above a BMI of around 25–30 kg/m2 regardless of parity.
Conclusions Consistency of findings across different populations, despite differences between them in terms of the time period covered, the BMI distribution, missing data and control for key confounders, suggests that severe under- and overweight may play a role in PTB risk. Supplementary Information The online version contains supplementary material available at 10.1186/s12916-023-03230-w.

health-related datasets. Birth outcome (live or stillborn), delivery details, gestational age at delivery, and maternal age were obtained from the Hospital Episode Statistics (HES) maternity data (Copyright © 2020, re-used with the permission of NHS Digital. All rights reserved); parity and birth interval from the pregnancy register; and BMI (recorded in the data as BMI itself or derived from recorded height and weight) and smoking from the primary care data. Gestational age in completed weeks is derived from ultrasound or, if not available, last menstrual period. We required BMI to be measured a maximum of 12 months pre-pregnancy and/or a maximum of 15 weeks' gestation. CPRD has National Research Ethics Service Committee (NRES) approval for research using the primary care and linked datasets. Individuals registered with participating GP practices are included in the CPRD dataset unless they specifically opt out. The CPRD study protocol was approved by the Independent Scientific Advisory Committee (ISAC; protocol number: 20_145R).
South Australian Better Evidence Better Outcomes Linked Data (BEBOLD) platform
Pregnancy data was obtained from the BEBOLD platform, which includes the South Australian Perinatal Statistics Collection 2007-2016. This is a mandatory collection of all births of at least 400 grams or 20 weeks' gestation. Maternal height and weight were reported at the first antenatal visit. Over 85% of women attend prior to 14 weeks of pregnancy, with height and weight data not reported after 20 weeks' gestation. Collection of height and weight data commenced in 2007, with a higher proportion of missing information in the first year of collection. Gestational age was determined from the first day of the last menstrual period if dates were reliable and early ultrasound (up to 20 weeks). A clinical examination could be used in the absence of these data or where there was uncertainty. Approval for the BEBOLD platform was obtained from the South Australian Department of Health's Human Research Ethics Committee, which included a waiver of individual consent for the use of de-identified administrative data.
US National Center for Health Statistics vital statistics (birth registration) data
US states are required to record births and deaths via certificates, and Federal law mandates national collection and publication of these data. These data are compiled by the National Center for Health Statistics (NCHS), anonymised and made publicly available; they form the National Vital Statistics System. BMI was included in the fetal death data files from 2014 onwards and, at the time of analysis, data on fetal deaths were available up to the end of 2019. Each record in these datasets relates to a live birth or fetal death, rather than a pregnancy, and there are no pregnancy or person-level identifiers in the dataset. To identify pregnancies we matched multiple births occurring close in time in which birth and maternal characteristics were the same. Gestational age at delivery was based on routine ultrasound measurements or last menstrual period for the small proportion (<1%) with no ultrasound-based data. Maternal pre-pregnancy weight and height were self-reported by the women at the time of birth. (When a mother registers a birth, she is required to complete a form called the Mother's Worksheet: https://www.cdc.gov/nchs/data/dvs/moms-worksheet-2016.pdf. Information needed for the birth certificate is collected, as is additional socio-demographic information (race, education level, marital status and so on) as well as other data, including smoking, height and pre-pregnancy weight.) These publicly available datasets are anonymised.
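The matching step described above, in which records of multiple births are grouped into pregnancies, can be illustrated with a minimal sketch. The field names (`plurality`, `delivery_date`), the choice of matching characteristics and the one-day window are all hypothetical; the source does not state which variables or time window were used.

```python
from collections import defaultdict
from datetime import date


def assign_pregnancy_ids(records, match_keys, max_gap_days=1):
    """Give each birth record a pregnancy id (illustrative sketch only).

    Singleton records each get their own id. Records from multiple births
    share an id when all `match_keys` values are identical and their
    delivery dates lie within `max_gap_days` of the previous matched record.
    """
    ids = [None] * len(records)
    next_id = 0
    buckets = defaultdict(list)
    for i, rec in enumerate(records):
        if rec["plurality"] == 1:
            ids[i] = next_id
            next_id += 1
        else:
            # Bucket multiple-birth records by their matching characteristics.
            buckets[tuple(rec[k] for k in match_keys)].append(i)

    for indices in buckets.values():
        indices.sort(key=lambda i: records[i]["delivery_date"])
        previous_date = None
        current_id = None
        for i in indices:
            this_date = records[i]["delivery_date"]
            if previous_date is None or (this_date - previous_date).days > max_gap_days:
                current_id = next_id  # start a new pregnancy
                next_id += 1
            ids[i] = current_id
            previous_date = this_date
    return ids


# Hypothetical records: two twins, one singleton, one unmatched twin record.
example = [
    {"plurality": 2, "delivery_date": date(2016, 3, 1), "maternal_age": 30},
    {"plurality": 2, "delivery_date": date(2016, 3, 1), "maternal_age": 30},
    {"plurality": 1, "delivery_date": date(2016, 3, 1), "maternal_age": 30},
    {"plurality": 2, "delivery_date": date(2016, 3, 1), "maternal_age": 31},
]
example_ids = assign_pregnancy_ids(example, ("plurality", "maternal_age"))
```

In practice more characteristics would be needed to make false matches unlikely; the two keys used here are only for illustration.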
Welsh linked data: Secure Anonymised Information Linkage (SAIL) Databank
Linkable datasets from the SAIL Databank are made available for approved analyses via a secure research environment, the UK Secure Research Platform (SeRP). The datasets used for the current study were primary care records, the National Community Child Health Database (NCCHD; birth registration plus child health and immunisation data), the Welsh Demographic Service Dataset (demographic characteristics of individuals registered with a GP in Wales) and the Maternity Indicators Dataset (MID; data, from 2014 onwards, on women from their first antenatal assessment together with data on labour and birth). The MID and the NCCHD provided all variables except socio-economic position (Index of Multiple Deprivation (IMD) 2014), which came from the Welsh Demographic Service Dataset, and BMI, which came either from the MID (derived from recorded height and weight) or from the primary care data (recorded as BMI itself or derived from recorded height and weight). BMI was from the MID for 77% of pregnancies and from the GP data for the remainder. If BMI came from the primary care data, we required it to be measured a maximum of 12 months pre-pregnancy and/or a maximum of 15 weeks' gestation. Gestational age at delivery is based on early ultrasound if available or, if not available, last menstrual period. Details regarding ethics and consent have been described previously; individuals are able to opt out of their data being transferred to SAIL.

Bradford maternity data (UK)
The Bradford Royal Infirmary is a large teaching hospital in Bradford, England, operated by the Bradford Teaching Hospitals NHS Foundation Trust. Gestational age at delivery was based on early ultrasound if available or, if not available, last menstrual period. BMI was derived from height and weight, which were measured by clinical staff at the first antenatal appointment. All data were anonymised and therefore patient consent and ethical approval were not required.

Availability of confounders
Socio-economic position (SEP) was measured differently across the datasets: years of education, index of multiple deprivation (IMD), Townsend score (another area-based deprivation index), and occupation. Ethnicity is not recorded in the Danish or Norwegian registries; in the Danish data, country of origin was included instead. We were able to adjust for either birth or pregnancy interval in all datasets except the Bradford maternity data; where a dataset had both birth and pregnancy interval, we used the most complete. Depending on the dataset, smoking was recorded as either (i) current smoker/non-smoker, (ii) ever/never smoked, or (iii) smoking during pregnancy, with information collected in each trimester (see Supplementary Tables S1-S8). Maternal age (in years) and parity (total number of previous births) were available and categorised in the same way in all datasets (maternal age categorised as: <25, 25-29, 30-34, 35-39, and 40+ years, although presented as mean (SD) years in Supplementary Tables S1-S8; parity was categorised as 0, 1, 2, 3, 4+).

Supplementary text A.2: Further details of statistical methods
a) Fractional polynomial models
To fit the fractional polynomials, BMI was first scaled [scaled BMI = (BMI - 10)/5]. This is done because values of the variable that are too large (or too small) can generate extreme values when raised to certain powers (e.g. cubic or squared reciprocal powers). Supplementary Table S10 gives the deviance for the different two-degree and three-degree fractional polynomials for each outcome in each dataset. Model fit was assessed using the change in deviance. Since the three-degree models fit better in all datasets except Connected Bradford and CPP, and because there was more consistency in terms of the best-fitting three-power models, we selected the optimal model from among those with three powers of scaled BMI. For all three outcomes, any preterm birth (PTB), spontaneous preterm birth (SPTB) and medically indicated PTB (MPTB), the optimal model had terms of scaled BMI with powers -2, -2, -2 (i.e. 1/(scaled BMI)^2, ln(scaled BMI) × 1/(scaled BMI)^2 and ln(scaled BMI) × ln(scaled BMI) × 1/(scaled BMI)^2). For any preterm birth, this polynomial was the best fitting model in three datasets and the second-best fitting in three; for SPTB it was the best fitting model in six of the eight datasets; and for medically indicated PTB it was the best fitting model in three datasets and the second-best in three.
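As an illustration of the scaling and the three repeated-power terms described above, the following minimal sketch computes the fractional polynomial terms for a given BMI and the corresponding predicted risk from a logistic model. The function names are hypothetical, and the coefficients passed to `predicted_risk` are placeholders rather than estimates from this study.

```python
import math


def scale_bmi(bmi):
    """Scaled BMI as defined in the text: (BMI - 10) / 5."""
    return (bmi - 10.0) / 5.0


def fp_terms(bmi):
    """Three fractional polynomial terms for powers (-2, -2, -2).

    Repeated powers are handled by multiplying by ln(scaled BMI), giving
    s^-2, s^-2 * ln(s) and s^-2 * ln(s)^2.
    """
    s = scale_bmi(bmi)
    if s <= 0:
        raise ValueError("scaled BMI must be positive (BMI > 10)")
    base = s ** -2
    log_s = math.log(s)
    return (base, base * log_s, base * log_s ** 2)


def predicted_risk(bmi, alpha, betas):
    """Predicted risk in the reference category of all confounders.

    `alpha` is the constant and `betas` the three coefficients; both are
    placeholders to be supplied by the caller.
    """
    linear = alpha + sum(b * t for b, t in zip(betas, fp_terms(bmi)))
    return 1.0 / (1.0 + math.exp(-linear))
```

At a BMI of 15 the scaled value is 1, so the logarithm vanishes and only the first term is non-zero, which gives a convenient check on the implementation.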
The study-specific estimates of the constant, α, and fractional polynomial terms, β₁, β₂, β₃, were used to plot the predicted risk of PTB against BMI for individuals in the reference category of all confounders for each study.
We used multivariate random effects meta-analysis to pool the fractional polynomial terms (i.e. the βs, the regression coefficients for the powers of sBMI) and the constant term, α, and then used the pooled estimates to plot the predicted (pooled) risk of PTB against BMI.
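To give a feel for how random effects pooling produces a pooled estimate and per-dataset weights, here is a minimal sketch of the simpler univariate DerSimonian-Laird estimator applied to one coefficient at a time. This is a swapped-in simplification, not the multivariate method used in the study, which pools all coefficients jointly and accounts for their covariances. The function name is hypothetical.

```python
import math


def dersimonian_laird(estimates, std_errors):
    """Univariate random effects pooling (DerSimonian-Laird).

    Returns the pooled estimate, its standard error, the between-study
    variance tau^2 and the normalised study weights.
    """
    variances = [se ** 2 for se in std_errors]
    w = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    k = len(estimates)
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum_w_star
    pooled_se = math.sqrt(1.0 / sum_w_star)
    weights = [wi / sum_w_star for wi in w_star]
    return pooled, pooled_se, tau2, weights
```

When the study estimates agree closely, tau^2 is truncated at zero and the result coincides with the fixed-effect inverse-variance estimate.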
c) Obtaining the estimated BMI at which the risk of PTB is lowest
These were obtained by differentiating the function given in equation [1] with respect to (scaled) BMI; minimum points occur where the value of this differential is equal to zero.
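Equation [1] is not reproduced in this section, so the working below is a sketch under an assumption: that it is the logistic model with constant α and three terms in scaled BMI s with powers -2, -2, -2, with coefficients written here as β₁, β₂, β₃. Because the logistic function is increasing, the risk is lowest where the linear predictor is lowest.

```latex
\begin{align*}
\operatorname{logit}(p) &= \alpha + \beta_1 s^{-2} + \beta_2 s^{-2}\ln s + \beta_3 s^{-2}(\ln s)^2,
  \qquad s = \frac{\mathrm{BMI} - 10}{5},\\[4pt]
\frac{d\,\operatorname{logit}(p)}{ds} &= s^{-3}\left[-2\beta_3(\ln s)^2 + 2(\beta_3 - \beta_2)\ln s + (\beta_2 - 2\beta_1)\right],\\[4pt]
\ln s^{*} &= \frac{(\beta_3 - \beta_2) - \sqrt{\beta_2^2 + \beta_3^2 - 4\beta_1\beta_3}}{2\beta_3},
  \qquad \mathrm{BMI}^{*} = 10 + 5\,s^{*}.
\end{align*}
```

Setting the bracket to zero gives a quadratic in ln s. Its two roots are a local minimum and a local maximum; the root shown is the minimum, since the derivative of the bracket with respect to ln s at that root equals twice the square root of the discriminant, which is positive. The expression requires β₃ ≠ 0 and a positive discriminant.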
To calculate confidence intervals, the estimates of α, β₁, β₂, and β₃, together with their variance-covariance matrix, were used to generate 100,000 bootstrapped samples of these coefficients using the drawnorm function in Stata, and the value of BMI at which the minimum point occurred was calculated for each sample. The confidence limits were obtained by taking the 2.5th and 97.5th percentiles of these.
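The procedure can be sketched in Python as follows. This is an illustration, not the authors' Stata code. It assumes the model has powers -2, -2, -2 of scaled BMI, in which case setting the derivative to zero gives a quadratic in ln(scaled BMI) whose solution does not involve the constant, so only the three slope coefficients are drawn. The function names, the handling of draws without a stationary point (they are skipped) and the nearest-rank percentile rule are all assumptions.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def bmi_at_minimum(b1, b2, b3):
    """BMI at the local minimum of the linear predictor, or None if there is none.

    Solves 2*b3*u^2 + 2*(b2 - b3)*u + (2*b1 - b2) = 0 for u = ln(scaled BMI)
    and takes the root at which the derivative changes from negative to positive.
    """
    disc = b2 * b2 + b3 * b3 - 4.0 * b1 * b3
    if b3 == 0 or disc <= 0:
        return None
    u = ((b3 - b2) - math.sqrt(disc)) / (2.0 * b3)
    if abs(u) > 50:  # guard against overflow for implausible draws
        return None
    return 10.0 + 5.0 * math.exp(u)


def bootstrap_ci(betas, cov, draws=100_000, seed=1):
    """Percentile interval for the BMI at minimum risk from multivariate normal draws."""
    rng = random.Random(seed)
    lower = cholesky(cov)
    minima = []
    for _ in range(draws):
        z = [rng.gauss(0.0, 1.0) for _ in betas]
        sample = [
            betas[i] + sum(lower[i][k] * z[k] for k in range(i + 1))
            for i in range(len(betas))
        ]
        value = bmi_at_minimum(*sample)
        if value is not None:
            minima.append(value)
    minima.sort()
    low = minima[int(0.025 * (len(minima) - 1))]
    high = minima[int(0.975 * (len(minima) - 1))]
    return low, high
```

Calling `bootstrap_ci` with the pooled coefficients and the 3 × 3 block of their variance-covariance matrix returns the lower and upper limits; the seed is fixed only to make the sketch reproducible.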

Table S10:
Characteristics of the whole sample, complete cases and excluded cases.
Univariate risk of any preterm, spontaneous preterm and medically indicated preterm birth by BMI category - parous women
Best fitting three-degree model; *Best fitting (two/three-degree) model among those displayed. The best fitting models in CPP and Connected Bradford were not necessarily the best fitting in any other dataset.
1. BMI ≥35 kg/m2 as too few individuals with BMI of 40 or higher in this dataset
2. Exact number suppressed for disclosure control purposes
3. BMI <25 kg/m2 - groups combined for disclosure control purposes

1. BMI ≥35 kg/m2 as too few individuals with BMI of 40 or higher in this dataset
2. BMI ≥35 kg/m2 - groups combined for disclosure control purposes
3. BMI <25 kg/m2 - groups combined for disclosure control purposes

Supplementary Table S11: Deviance from different fractional polynomials for each dataset - nulliparous women
Columns: Polynomial terms (two-degree models); Polynomial terms (three-degree models)
a Best fitting two-degree model; b Best fitting three-degree model; *Best fitting (two/three-degree) model among those displayed. The best fitting models in CPP and Connected Bradford were not necessarily the best fitting in any other dataset. However, the deviance for the models displayed was very similar to (or the same, to the nearest whole number, as) the deviance for the best-fitting model. CPP: Any PTB 11,373 (two-degree); SPTB 10,543 (three-degree); Connected Bradford: Any PTB 871 (three-degree); SPTB 604 (two-degree), 603 (three-degree); MPTB 436 (two-degree).

Table S14:
Random effects meta-analysis: results are the terms (and their standard errors) for each dataset, the pooled result, and the weights (sBMI = scaled BMI) - parous women

Table S15:
Adjusted odds ratios (95% CI) of any preterm, spontaneous preterm and medically indicated preterm birth by BMI category - nulliparous women

Table S16:
Adjusted odds ratios (95% CI) of any preterm, spontaneous preterm and medically indicated preterm birth by BMI category - parous women

Table S17:
Adjusted odds ratios (95% CI) for any very (<32 completed weeks) preterm, spontaneous very preterm and medically indicated very preterm birth by BMI category - nulliparous women

Table S18:
Adjusted odds ratios (95% CI) for any very (<32 completed weeks) preterm, spontaneous very preterm and medically indicated very preterm birth by BMI category - parous women
1. Weighting SPTB by the inverse of one minus the probability of being a medically indicated preterm birth and weighting medically indicated preterm birth by the inverse of one minus the probability of being a SPTB
2. BMI ≥35 kg/m2 as too few individuals with BMI of 40 or higher in this dataset

Supplementary Table S21: Adjusted odds ratios (95% CI) for any preterm, spontaneous preterm and medically indicated preterm birth by BMI category excluding stillbirths - nulliparous women

Adjusted odds ratios (95% CI) for any preterm, spontaneous preterm and medically indicated preterm birth by BMI category excluding stillbirths - parous women

Table S23:
Adjusted odds ratios (95% CI) for any preterm, spontaneous preterm and medically indicated preterm birth by BMI category excluding post term births - nulliparous women

Table S24:
Adjusted odds ratios (95% CI) for any preterm, spontaneous preterm and medically indicated preterm birth by BMI category excluding post term births - parous women

Table S25:
Adjusted odds ratios (95% CI) for any preterm, spontaneous preterm and medically indicated preterm birth by BMI category excluding multiple births - nulliparous women
BMI ≥35 kg/m2 as too few individuals with BMI >40 in this dataset

Supplementary Table S26: Adjusted odds ratios (95% CI) for any preterm, spontaneous preterm and medically indicated preterm birth by BMI category excluding multiple births - parous women